ABSTRACT

A computer system is disclosed which provides for execution
of real-time code from cache memory. A cache management unit
provides the real-time code to the cache memory from system
memory upon an initiation of a read operation by a processor.
Once the code is in cache memory, the processor executes the
real-time code from cache memory instead of system memory.
The cache management unit detects read hits to cache each
time the processor requests an instruction of code that is
stored in the cache memory. Lock bits associated with each
line of cache lock the contents of the line, preventing the
line from being overwritten under normal cache operation in
which the least recently used cached data is replaced by
presently accessed data. Alternatively, one of a plurality of
cache data ways may be dedicated to storing real-time code.
Real-time code stored in the dedicated data way is not
replaceable and thus is locked.

17 Claims, 5 Drawing Sheets 
PROGRAMMABLE CACHE INCLUDING A NON-LOCKABLE DATA WAY AND A
LOCKABLE DATA WAY CONFIGURED TO LOCK REAL-TIME DATA

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates to more efficient execution of
software in computer systems. More particularly, the present
invention relates to computer systems for executing software
stored in cache memory subsystems. Still more particularly,
the present invention relates to cache subsystems that load
real-time event processing software for more efficient
execution.

2. Description of the Relevant Art

Software to be executed by a microprocessor typically is
stored on a floppy or fixed disk medium. Once a request is
made by a user to execute a program, the program is loaded
into the computer's system memory which usually comprises
dynamic random access memory devices (DRAM). The processor
then executes the code by fetching an instruction from system
memory, receiving the instruction over a system bus,
performing the function dictated by the instruction, fetching
the next instruction, and so on.

Generally, whenever system memory is accessed, there is a
potential for delay between the time the request to memory is
made (either to read or write data) and the time when the
memory access is completed. This delay is referred to as
"latency" and can limit the performance of the computer.
There are many sources of latency. For example, operational
constraints with respect to DRAM devices cause latency.
Specifically, the speed of memory circuits is based upon two
timing parameters. The first parameter is memory access time,
which is the minimum time required by the memory circuit to
set up a memory address and produce or capture data on or
from the data bus. The second parameter is memory cycle time,
which is the minimum time required between two consecutive
accesses to a memory circuit. For DRAM circuits, the cycle
time typically is approximately twice the access time. DRAM
circuits today generally have access times in the approximate
range of 60-100 nanoseconds, with cycle times of 120-200
nanoseconds. The extra time required for consecutive memory
accesses in a DRAM circuit is necessary because the internal
memory circuits require additional time to recharge (or
"precharge") to accurately produce data signals. Thus, even a
processor running as slow as 10 MHz cannot execute two memory
accesses in immediate succession (i.e., with adjacent clock
pulses) to the same 100 nanosecond DRAM chip, despite the
fact that a clock pulse in such a microprocessor is generated
every 100 nanoseconds. A DRAM chip requires time to stabilize
before the next address in that chip can be accessed.
Consequently, in such a situation the processor must wait by
executing one or more loop cycles before accessing the DRAM
chip again. A memory control unit ("MCU") is provided as part
of the computer system to regulate accesses to the DRAM main
memory. Latency caused by long memory cycle times relative to
processor speeds has become a particularly acute problem
today as processor speeds in excess of 100 MHz are
commonplace. Instead of waiting one or two clock cycles to
again access a 100 nanosecond DRAM device, today's "486" and
"Pentium" processors must wait 20 or more clock cycles.

In addition to the delays caused by access and cycle times,
DRAM circuits also require periodic refresh cycles to protect
the integrity of the stored data. These cycles consume
approximately 5 to 10% of the time available for memory
accesses, and typically are required approximately every 4
milliseconds. If the DRAM circuit is not refreshed
periodically, the data stored in the DRAM circuit will be
lost. Thus, memory accesses may be halted while a refresh
cycle is performed.

Further, most, if not all, computer architectures today
include multiple bus master systems. Any one of a number of
bus masters may obtain ownership or control of the system bus
and thereby access system memory. Normally, granting a bus
master device ownership of the system bus, from among
competing requests for ownership, is based on a predetermined
hierarchy. In a hierarchy scheme, one bus master device may
have a higher position in the hierarchy than another bus
master device. Accordingly, the former device would be
granted ownership of the system bus if there were a conflict
between the two devices where each device contemporaneously
sought control of the bus. Although hierarchy schemes are
valuable for resolving conflicts between multiple bus master
devices requesting control of the bus to access system
memory, such schemes force one device to wait while the other
device executes its memory transaction, thereby causing
latency with respect to the waiting device.

The latency associated with memory accesses may be different
and unpredictable from one memory access to the next. For
many software applications unpredictable latency is not a
significant problem. However, for code sequences, especially
those related to real-time event processing such as music
synthesis which implement digital signal processing,
unpredictable latency can greatly interfere with proper
performance and produce undesirable results.

To expedite memory transfers, most computer systems today
incorporate cache memory subsystems. Cache memory is a
high-speed memory unit interposed between a system memory and
a processor. Cache memory devices usually have speeds
comparable to the speed of the processor and are much faster
than DRAM memory. The cache concept anticipates the likely
reuse by a microprocessor of selected data in system memory
by storing a copy of the selected data in the cache memory.
When a read request is initiated by the processor for data, a
cache controller determines whether the requested information
resides in the cache memory. If the information is not in the
cache, then the system memory is accessed for the data and a
copy of the data may be written to the cache for possible
subsequent use. If, however, the information resides in the
cache, it is retrieved from the cache and given to the
processor. Retrieving data from cache advantageously is
faster than retrieving data from system memory, involving
both less latency and more predictable latency.

Code, as well as data, is subject to being stored in cache.
Cache memory size, however, is generally much smaller than
system memory size, and the cache holds recently accessed
information, anticipating the reuse of that information.
Because the cache is relatively small and is only storing the
most recently accessed code or data, older code or data
(i.e., less recently used code or data in the cache) is
overwritten by new code or data. Although replacement
generally causes no problem for many types of data and code,
replacement of real-time code may detrimentally affect the
predictability of the latency of accesses to the real-time
code or data and thus may cause improper or poor multimedia
performance.

BRIEF SUMMARY OF THE INVENTION

The problems outlined above are in large part solved by the
teachings of the present invention. The present invention





5,913,224 

relates to a system and method for locking real-time code
into a cache memory to avoid repetitively accessing the
real-time code from system memory. A processor reads the
real-time code and the cache subsystem writes the code into
an entry of the cache upon detecting a read miss. An output
signal from the processor during the reading of the real-time
code indicates to the cache subsystem the real-time nature of
the code. In response, the cache subsystem locks the code
into the cache, preventing overwriting the code with more
recently used data. The real-time code is locked into cache
by setting a lock bit associated with each line of cache
containing the real-time code. Once the code is stored and
locked into the cache subsystem, the processor fetches
instructions for execution. Because the instructions have
been stored in cache, the cache subsystem, according to
normal cache protocol, supplies the requested instructions.

After the processor has completed execution of the real-time
code from the cache subsystem, the processor may direct the
cache subsystem to unlock the previously executed real-time
code to allow for other real-time code modules to be executed
from cache memory. Unlocking real-time code is accomplished
by clearing the lock bits associated with the lines
containing real-time code.

An alternative to setting lock bits associated with each line
of cache containing real-time code includes a real-time
address register that generally defines which system memory
addresses contain real-time code. The register preferably
includes a starting address of the real-time code and a size
value representing the number of addresses containing the
real-time code. The register also includes a valid bit to
indicate whether the real-time locking feature of the
invention is turned on or off. When the valid bit is off, all
information, including real-time code, is stored in cache
according to normal cache behavior. However, when the valid
bit is on, non-real-time code is stored in a first way in the
cache and real-time code is stored in a second cache way.
Real-time code stored in the second cache way is not replaced
and thus is locked into the cache. To unlock real-time code
in the alternative embodiment, the valid bit in the real-time
register is cleared.

BRIEF DESCRIPTION OF THE DRAWINGS

Other objects and advantages of the invention will become
apparent upon reading the following detailed description and
upon reference to the accompanying drawings in which:

FIG. 1 is a block diagram representation of a typical
computer system;

FIG. 2 is a block diagram of the computer system consistent
with the preferred embodiment for locking real-time code into
cache;

FIG. 3 is a block diagram of the preferred computer system
showing the data flow of real-time code into cache;

FIG. 4 is a flow chart outlining the steps in executing
real-time code from cache memory; and

FIG. 5 is a block diagram showing an alternative embodiment
for locking real-time code in cache.

While the invention is susceptible to various modifications
and alternative forms, specific embodiments thereof are shown
by way of example in the drawings and will herein be
described in detail. It should be understood, however, that
the drawing and detailed description thereto are not intended
to limit the invention to the particular form disclosed, but
on the contrary, the intention is to cover all modifications,
equivalents and alternatives falling within the spirit and
scope of the present invention as defined by the appended
claims.

DETAILED DESCRIPTION OF THE INVENTION

Turning now to the drawings, FIG. 1 is a block diagram of a
computer system 100 to which the present invention is
applied. A processor 101 couples to a system memory unit 300
via an L2 cache subsystem 200. Information stored in system
memory is accessible by the processor 101 and accessed data
preferably is made available to L2 cache subsystem 200 for
temporary storage consistent with common caching techniques.
As one of ordinary skill in the art will recognize, many
different types of information are stored in system memory
300 including real-time code 320 and non-real-time code 350.
Processor 101 may fetch code from an internal cache, L2 cache
subsystem 200 or system memory 300. As described in greater
detail below, however, processor 101 preferably executes
real-time code 320 from the L2 cache subsystem, rather than
system memory 300, thereby reducing deleterious latency
effects on real-time computations.

Referring now to FIG. 2, the processor 101, L2 cache
subsystem 200, and system memory 300 are shown in greater
detail. Consistent with the preferred embodiment, the
processor 101 includes a CPU core 110 coupled to an internal
cache memory subsystem 115 and a local bus interface 140 via
local bus 130. Local bus 130 has a predetermined bit width
and is the processor's primary bus. Bus interface 140
provides processor 101 with an interface to L2 cache
subsystem 200 over lines 160, 161. The CPU core 110 also
includes a translation lookaside table ("TLB") 150.

As illustrated, computer system 100 embodies a single
processor. It is understood, however, that the present
invention may be adapted to multi-processor systems. CPU core
110 is a data processing unit that implements a predetermined
instruction set. Exemplary processing units include models
80386, 80486, and Pentium microprocessors. The present
invention should not be limited to any particular processing
units.

The TLB 150 generally comprises a cacheable set of table
entries 151 to provide translations between virtual addresses
and physical addresses, as one of ordinary skill in the art
would know. Normally, a page address is derived from the
upper order bits of the virtual address and used to access
physical page addresses in the TLB. Pages range in size, but
4K bytes is typical. Also stored in each table entry are
various attributes such as read/write bits 152 for indicating
whether the data stored at the associated physical address is
read-only, write-only, or both. Consistent with the preferred
embodiment, the TLB 150 includes with each entry 151 a
real-time code bit 153 that specifies whether the information
stored at the associated physical page address includes
real-time code or not. The real-time code bit 153 is written
when the relevant table entry is created by the operating
system during commonly known allocation schemes. It is noted
that the table entries are only temporarily stored in TLB
150. A table in memory preferably is used to store table
entries.

The TLB real-time lock bit 153 allows real-time code to be
stored in L2 cache without being overwritten through normal
cache replacement behavior. Because cache subsystems
typically overwrite their contents based on a least recently
used scheme, real-time code executed from cache is at risk
for being overwritten before it is completely executed. To
avoid overwriting real-time code, the L2 cache is able to
lock in specified contents.
The L2 cache subsystem 200 preferably includes an L2
cache memory 201 coupled to a cache management unit 202 
for directing the transfers of data into and out of the L2 cache 
memory 201. Cache management unit 202 also controls and 
orchestrates the transfer of data, address and control signals 
between local bus 130 and system memory 300. Cache
management unit 202 preferably includes a memory con- 
troller for providing access to L2 cache memory 201. The 
memory controller may be any one of a number of com- 
monly known memory controllers compatible with the 
selected CPU core 110 and overall computer architecture. 
Such a memory controller may be located as part of the 
processor 101. Processor 101 also includes a real-time lock 
output signal 161 preferably provided to the cache management
unit 202. The real-time lock signal 161 indicates that
the information in system memory 300 requested by pro- 
cessor 101 includes real-time code. This feature will be 
explained in more depth below. L2 cache subsystem 200 
also includes a bus interface 235 which provides an interface 
to system memory 300.

L2 cache memory 201 includes a plurality of cache lines
220. Associated with each line of L2 cache memory 201 is
address tag and state information (not specifically shown). 
The address tag indicates a physical address in system 
memory 300 corresponding to each entry within cache
memory 201. In this embodiment each entry within L2 cache 
memory 201 is capable of storing a line of data. A line of 
data preferably consists of four double words 221 (where 
each double word comprises 32 bits). It is understood, 
however, that a line could contain any number of words or
double words, depending upon the system. It is further 
understood that a double word could consist of any number 
of bits. 

The state information is comprised of a valid bit and a set 
of dirty bits. A separate dirty bit is allocated for each double 
word within each line. A valid bit indicates whether a 
predetermined cache line contains valid cache data, while 
the dirty bits identify the write status of each double word 
within each cache line. In an invalid state, there is no valid
data in the corresponding cache memory entry. In a valid and
clean state, the cache memory entry contains data which is 
consistent with system memory 300. In a valid and dirty
state, the cache memory entry contains valid data which is 
inconsistent with system memory 300. Typically, the dirty
state results when a cache memory entry is altered by a write 
operation. 
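
The line layout and state information described above can be sketched as a small data structure. This is only an illustrative model, not the patent's circuitry: the class name `CacheLine`, the method names, and the string labels for the three states are assumptions chosen for the example, while the four 32-bit double words per line, the single valid bit, and the per-double-word dirty bits follow the text.

```python
from dataclasses import dataclass, field
from typing import List

DWORDS_PER_LINE = 4  # preferred line size in the text: four 32-bit double words


@dataclass
class CacheLine:
    """Hypothetical model of one L2 cache line and its state information."""
    tag: int = 0                      # physical address tag
    valid: bool = False               # one valid bit per line
    dirty: List[bool] = field(default_factory=lambda: [False] * DWORDS_PER_LINE)
    data: List[int] = field(default_factory=lambda: [0] * DWORDS_PER_LINE)

    def write_dword(self, index: int, value: int) -> None:
        # A write alters the entry, so the matching dirty bit is set.
        self.data[index] = value & 0xFFFFFFFF
        self.dirty[index] = True

    def state(self) -> str:
        # Invalid, valid and clean, or valid and dirty, as in the text.
        if not self.valid:
            return "invalid"
        return "valid-dirty" if any(self.dirty) else "valid-clean"
```

A line starts out invalid; once filled from system memory it is valid and clean, and it becomes valid and dirty only after `write_dword` touches one of its double words.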

Cache management unit 202 includes an address tag and
state logic circuit (not specifically shown) that contains and
manages the address tag and state information, a comparator
circuit for determining whether a cache hit has occurred, and
a snoop write-back circuit that controls the write back of
dirty data within L2 cache memory 201. It will be appreciated
by those skilled in the art that cache management unit
202 may contain additional conventional circuits to control 
well-known caching functions such as various read, write, 
update, invalidate, copy-back, and flush operations. Such 
circuitry may be implemented using a variety of configura- 
tions. 

In one embodiment, L2 cache subsystem 200 comprises a 
set associative cache configuration. Least recently used 
replacement may be employed to select one of the ways for 
replacement. 

System memory 300 is a physical memory device of a 
predetermined size and may be implemented with DRAM 
(dynamic random access memory). System memory 300 
may be used to store data, code, and the like. The code stored
in system memory 300 includes real-time code 320. Multiple
real-time code modules may be stored in system memory 
300. 

Referring still to FIG. 2, cache memory 201 includes a 
plurality of lines of data 220. Preferably associated with 
each line of data is a lock bit 230. The lock bit can be set to 
lock the associated line of data. Once locked, the line of data
cannot be overwritten pursuant to normal cache behavior in 
which the least recently used cache line is overwritten by 
new data to be stored in the cache. The lock bit overrides the 
least recently used replacement scheme for the line associ- 
ated with the lock bit. A "0" value for the lock bit indicates 
that the associated line of data is not locked, whereas a lock
bit value of "1" indicates that the line of data is locked. The 
logic level of the lock bits, of course, can be reversed, i.e. a 
"0" value indicating the associated line is locked and a "1" 
value indicating that the associated line is not locked. For 
purposes of the following discussion, it is assumed that a 
logic "1" lock bit value indicates the locked condition. To 
specify which cache contents to lock, computer system 100 
asserts its real-time lock output signal on line 161 to indicate
to the cache management unit 202 when to lock data in 
cache. 
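
How a lock bit overrides least recently used replacement can be illustrated with a short sketch. The function name `choose_victim`, the use of integer timestamps for recency, and the choice to return `None` when every line is locked are assumptions made for the example; the text does not say what the cache does when no unlocked line is available.

```python
from typing import List, Optional


def choose_victim(last_used: List[int], locked: List[bool]) -> Optional[int]:
    """Pick the least recently used line among the lines whose lock bit is 0.

    last_used[i] is a hypothetical timestamp of the most recent access to
    line i (smaller means older); locked[i] mirrors that line's lock bit.
    Returns None when every line is locked (behavior assumed, not from the text).
    """
    candidates = [i for i, is_locked in enumerate(locked) if not is_locked]
    if not candidates:
        return None
    return min(candidates, key=lambda i: last_used[i])
```

With `last_used = [5, 1, 3]` and line 1 locked, the oldest line is skipped and line 2 is chosen instead, which is the override the paragraph describes.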

Consistent with the preferred embodiment, generally four 
major steps facilitate the execution of real-time code from 
L2 cache. These steps presuppose that the targeted real-time 
code module has already been written by the operating 
system into the system memory 300 from a disk or other 
medium on which the code was stored. First, the processor
directs the entire block of real-time code to be stored in the
L2 cache memory 201 while indicating to the cache management
unit 202 that the information being stored in cache
comprises real-time code, as opposed to non-real-time code, 
data, or other types of information. Second, the L2 cache 
subsystem 200 locks the real-time code into the L2 cache 
memory 201 to avoid overwriting. Third, the processor 101 
executes the real-time code after it has been stored in L2
cache memory 201. Lastly, after the processor has com- 
pleted its execution of the real-time code from L2 cache 
memory and no longer needs access to the code, the L2 
cache subsystem unlocks the real-time code freeing up that 
part of L2 cache memory for other real-time code modules. 

Software to be executed by CPU core 110 normally is 
transferred from a disk to system memory 300 and then 
fetched from system memory 300 by CPU core 110 through 
the L2 cache subsystem 200 and lines 160. Consistent with 
the preferred embodiment, computer system 100 takes 
advantage of the L2 cache memory's lock bits, the TLB 
real-time code bit, and the processor's real-time lock output 
signal 161 to allow execution of real-time code 320 from L2
cache memory 201, instead of system memory 300. 

Referring now to FIG. 3, a block diagram illustrating the 
flow of data within computer system 100 to transfer real- 
time code 320 from system memory 300 to cache memory 
201 is shown. Real-time code 320 consists of a plurality of 
double words, as exemplified by double words "A" through 
"Z." Before the CPU core 110 executes real-time code 320 
from L2 cache memory 201, the CPU core 110 must have the 
real-time code 320 transferred from system memory 300 to 
L2 cache memory 201. This process preferably is accomplished
by a read operation by the CPU core 110 of all of the
double words in system memory 300 comprising real-time 
code 320. Because the real-time code 320 does not already 
exist in the L2 cache memory 201 when the CPU 110 reads 
the real-time code 320 for the first time, the cache manage- 
ment unit 202 detects a read miss and directs the real-time 
code to be transferred into L2 cache memory 201 pursuant 
to normal L2 cache behavior. A copy of the real-time code
thus is placed in L2 cache memory 201 as indicated by the
lines of L2 cache memory comprising double words "A"
through "Z". 

These lines of cache containing real-time code ultimately
must be locked to prevent replacement. The L2 cache
subsystem 200, therefore, must be made aware which of its 
contents include real-time code and which do not. The 
processor 101 provides this indication by asserting the 
real-time lock output signal while reading the code from 
system memory 300. This signal indicates to the cache 
management unit 202 that it must lock the lines of cache in 
which it writes the associated real-time code. 

The following discussion describes how the processor 101 
determines that the code it requests for executing is real-time 
code. As stated, the TLB 150 includes a real-time code bit 
for each entry. Thus, when the processor initiates a read 
request and translates the requested virtual address to a 
physical address by accessing the TLB, the CPU core 110 
reads the associated real-time code bit 153. If the bit is set 
to indicate that the requested information is real-time code,
the processor asserts its real-time lock output signal on line 
161 while also asserting the address and data signals to 
effectuate a read cycle. 
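
The translation step above, in which the real-time code bit read from the TLB entry drives the lock output signal, can be sketched as follows. The names `TlbEntry` and `translate`, the dictionary standing in for the TLB, and the fixed 4K page size are assumptions for illustration; TLB misses and the read/write attribute bits are left out.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

PAGE_SIZE = 4096  # the text calls 4K bytes typical; fixed here for simplicity


@dataclass
class TlbEntry:
    physical_page: int  # physical page number for this translation
    real_time: bool     # real-time code bit 153


def translate(tlb: Dict[int, TlbEntry], virtual_address: int) -> Tuple[int, bool]:
    """Return (physical address, lock signal) for a read request.

    The upper-order bits of the virtual address select the entry; the
    entry's real-time bit becomes the state of the lock output signal.
    A missing entry raises KeyError (TLB miss handling is not modeled).
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = tlb[page]
    return entry.physical_page * PAGE_SIZE + offset, entry.real_time
```

For a TLB holding `{2: TlbEntry(7, True)}`, a read of virtual address `2 * 4096 + 0x10` yields physical address `7 * 4096 + 0x10` with the lock signal asserted.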

Upon detecting a read miss while the processor's real-time
lock output signal is asserted, the cache management unit 202
writes the requested real-time code 320 to L2 cache
memory 201 and sets the lock bit associated with that line to
a logic "1" indicating that this line of cache memory cannot
be overwritten by subsequent cache replacement activity.
Alternatively, the cache management unit 202 may wait until the
entire real-time code 320 is stored in L2 cache memory 201
before setting all of the lock bits to a logic "1" level. Once
the real-time code is completely stored in L2 cache memory
201 and all of the associated lock bits are set, the CPU core
110 then can execute the real-time code 320. At this point,
as one of ordinary skill in the art will readily understand, it
is transparent to the CPU core 110 that execution of the
real-time code is from L2 cache memory 201 instead of from
system memory 300. The CPU core 110 fetches each
instruction of the real-time event handler by issuing physical
addresses pertaining to locations in system memory 300 of
the real-time code 320. The cache management unit 202,
however, detects a read hit as the requested instruction of the
real-time code is also stored in L2 cache memory 201. In
response, the cache management unit 202 directs the
requested instruction to be supplied to the CPU core 110
from L2 cache memory 201 instead of from system memory
300. In this manner, the real-time event handler is executed
by the CPU core 110 from L2 cache memory 201 with
reduced latency and increased latency predictability.
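
The miss-then-hit sequence just described can be modeled in a few lines. This is a deliberately simplified sketch: the class `L2Model`, its `read` method, and the use of a dictionary keyed by line address are assumptions, and capacity limits and replacement are not modeled at all. It shows only that a read miss with the lock signal asserted fills and locks a line, and that later reads of the same line are hits.

```python
from typing import Dict, Set, Tuple

LINE_BYTES = 16  # four 32-bit double words per line


class L2Model:
    """Hypothetical fill-and-lock model of the L2 cache (no replacement)."""

    def __init__(self) -> None:
        self.lines: Dict[int, bytes] = {}  # line base address -> cached copy
        self.locked: Set[int] = set()      # line addresses whose lock bit is 1

    def read(self, memory: bytes, address: int, lock_signal: bool) -> Tuple[bytes, bool]:
        """Return (line data, hit flag) for a processor read."""
        base = address - address % LINE_BYTES
        if base in self.lines:
            return self.lines[base], True       # read hit: served from cache
        data = memory[base:base + LINE_BYTES]   # read miss: fetch from memory
        self.lines[base] = data
        if lock_signal:
            self.locked.add(base)               # set the line's lock bit
        return data, False
```

Reading every address of a real-time module once with `lock_signal=True` preloads and locks it; subsequent instruction fetches to those addresses then return with the hit flag set.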

Once the real-time code 320 is completely executed, it
may be desired to unlock the real-time code 320 from L2
cache memory, thus freeing up cache entries for other code
or data. Cache lines are unlocked by changing the state of the
lock bit associated with the targeted lines. In computer
systems consistent with the preferred embodiment, invalidate
or flush operations preferably are used to unlock cache
entries. Invalidating the cache preferably is initiated upon
operating system reallocation of the page corresponding to
the real-time code, as one of ordinary skill in the art would
understand. For example, when a page is selected for
reallocation, if the current translation to the page has the
real-time code bit set, the operating system may execute a
flush operation to each line in the page. L2 subsystem 200
resets the lock bit for the corresponding line. Alternatively,
a flush operation indicated to be a real-time operation via
real-time signal 161 may cause all of the lock bits to be reset.
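
The per-line flush used to unlock a reallocated page can be sketched as a loop over the page's line addresses. The function name `flush_page`, the set-based bookkeeping, and the 4K page and 16-byte line sizes are assumptions for the example; write-back of dirty data during the flush is not modeled.

```python
from typing import Set

PAGE_SIZE = 4096
LINE_BYTES = 16


def flush_page(locked_lines: Set[int], cached_lines: Set[int], page_base: int) -> None:
    """Flush every line of one page: drop the cached copy and reset its lock bit.

    Both sets hold line base addresses and are modified in place.
    """
    for base in range(page_base, page_base + PAGE_SIZE, LINE_BYTES):
        cached_lines.discard(base)
        locked_lines.discard(base)
```

Lines belonging to other pages keep their lock bits, so other real-time modules that are still resident stay protected.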




FIG. 4 shows a flow diagram exemplifying a method 
consistent with the preferred embodiment for executing 
real-time code from L2 cache memory. In step 405, the 
real-time code module to be executed is selected and copied 
into the system memory by the operating system (step 410). 
Page allocations and TLB entries in step 415 are updated and 
if the code copied into system memory is a real-time 
module, the real-time bit in the corresponding TLB entry is 
updated. The CPU determines whether the code is real-time 
or not in step 420 by accessing and checking the state of the 
real-time code bit corresponding to the real-time code. If the 
code is real-time, the CPU accesses the entire code in step 
435. Upon detecting a read miss in step 440, the L2 cache 
subsystem writes the real-time code into cache and locks the 
associated lines. In step 445, the CPU may then execute the 
real-time code by fetching instructions which are provided 
by the L2 cache where the code is stored. Finally, in step 
450, the lines of cache memory that were used to store the 
real-time code are unlocked as described above. If, however,
the code is not real-time (step 420), the CPU fetches the code
from memory and executes it according to known protocols. 

It is noted that while the L2 cache in FIGS. 1-3 is
interposed between the processor 101 and system memory 
300, other cache configurations are possible. For example, 
the cache may comprise a look aside configuration or 
backside cache configuration in which a system memory bus 
couples the CPU, cache, and system memory. 

Referring now to FIG. 5, an alternative embodiment for 
locking real-time code into cache is shown to comprise L2 
cache subsystem 200 coupled to processor 101 over lines 
160 and to system memory 300. L2 subsystem 200 includes 
a processor interface 250, tag logic 255, memory interface 
295, data way 0, data way 1, multiplexer 290, comparators 
260 and 285, real-time address registers 265, and replace- 
ment logic 287. Processor interface 250 couples to tag logic 
255, data way 0, and comparators 260, 285 over lines 252.
Tag logic 255 provides tag information to comparator 260
which compares the address signals provided by processor 
interface 250 to the tag information provided by tag logic 
255 to determine the existence of a cache hit or miss, as one 
of ordinary skill in the art would know. Comparator 260 
provides an output signal on line 262 to multiplexer 290 and 
another output signal on line 263 to replacement logic 287. 
The output signal provided to multiplexer 290 is asserted 
upon detecting a cache hit by comparator 260 to select the 
requested data from the data way that contains the requested 
data. The output signal on line 263 indicates the presence of 
a cache miss to replacement logic 287. 

Memory interface 295 directs the operation of data way 0 
and data way 1 via lines 297 during cache hits and misses
and also provides communication with system memory 300 
over lines 296 for retrieving data from system memory 300 
to be stored in one of the two data ways. Although only two 
data ways are shown in FIG. 5, one of ordinary skill in the
art will recognize that the invention could include additional 
data ways. The output signals from data way 0 and data way 
1 over lines 272, 282, respectively are provided to multi- 
plexer 290. Multiplexer 290 is a known 2:1 multiplexer in 
which one of two input signals is provided as an output 
signal in response to the state of a control signal. During a 
cache miss, multiplexer 290 is controlled by the output 
signal from comparator 260 on line 262. The signal on line 
262 determines which of the two input signals on lines 272
and 282 is to be selected by multiplexer 290 as an output
signal on line 292. The output signal of multiplexer 290 is 
provided to processor interface 250. 
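
The hit detection and way selection described for FIG. 5 can be sketched in software terms. The function name `lookup` and the list-based representation of the two ways are assumptions for the example; the real comparator and multiplexer are hardware circuits, and set indexing is omitted so that each way holds a single entry.

```python
from typing import List, Optional, Tuple


def lookup(way_tags: List[Optional[int]], way_data: List[int], tag: int) -> Tuple[bool, Optional[int]]:
    """Model comparator 260 and multiplexer 290 for one cache set.

    way_tags[w] is the tag stored for way w (None if the way is empty) and
    way_data[w] is that way's data. On a hit the matching way's data is
    selected; on a miss nothing is selected.
    """
    for way, stored in enumerate(way_tags):
        if stored is not None and stored == tag:
            return True, way_data[way]  # hit: the multiplexer passes this way's data
    return False, None                  # miss: reported to the replacement logic
```

With tags `[0x3, 0x8]` and data `[111, 222]`, a request for tag `0x8` selects the data from way 1.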

During a cache miss, memory interface 295 is controlled 
by replacement logic 287 to store data corresponding to the 
new address (i.e., the address for which there was a cache
miss) in one of the data ways. Memory interface 295 
retrieves the requested data from system memory over lines 
296 and stores the data in one of the two data ways as 
determined by replacement logic 287 in accordance with the 
present invention. 
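
The fill performed on a cache miss can be sketched as follows, assuming the replacement logic has already chosen a data way. The function name `fill_on_miss`, the line size, the number of sets, and the representation of system memory as a byte string are hypothetical and chosen only to make the example runnable.

```python
LINE_BYTES = 32  # hypothetical cache line size in bytes
NUM_SETS = 4     # hypothetical number of entries per data way


def fill_on_miss(addr, way, tags, ways, system_memory):
    """Copy the line containing addr from system memory into the chosen way.

    `way` is supplied by the caller and stands in for the decision made by
    replacement logic 287; this function models only memory interface 295.
    """
    line = addr // LINE_BYTES
    tag, index = line // NUM_SETS, line % NUM_SETS
    base = line * LINE_BYTES
    # Retrieve the line from system memory (the role of lines 296).
    ways[way][index] = bytes(system_memory[base:base + LINE_BYTES])
    # Record the tag so that a later lookup of this address hits.
    tags[way][index] = tag
    return ways[way][index]


# Usage: fill data way 1 with the line that contains address 0x44.
memory = bytes(range(256)) * 4
tags = [[None] * NUM_SETS for _ in range(2)]
ways = [[None] * NUM_SETS for _ in range(2)]
filled = fill_on_miss(0x44, 1, tags, ways, memory)
```
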

Real-time address register 265, although shown as a single
register in FIG. 5, may include multiple registers. Each
register preferably includes a real-time address field (RTA),
a valid bit field (V), and a size field (Sz) associated with a
real-time code module. The real-time address field preferably
includes the starting address of the real-time code. The
size field indicates the size of the real-time code whose
starting address is specified in the RTA field of register 265.
The RTA and Sz fields thus specify the location of real-time
code in system memory. As explained below, by
comparing an address from processor 101 to the contents of
register 265, comparator 285 can determine whether the
address is an address pertaining to real-time code. The V bit
indicates whether the real-time code locking feature of the
present invention associated with the real-time code beginning
at the RTA address is enabled or disabled. Thus, a V bit
that is set indicates that the real-time code locking feature is
turned on (enabled) for the real-time code specified by the
RTA and Sz fields. Conversely, the real-time code locking
feature can be turned off (disabled) by clearing the V bit. The
contents of the real-time address register preferably are
initiated by the computer's operating system or a device driver
as one of ordinary skill will recognize.

It should be recognized that the contents of register 265
generally define which system memory addresses contain
real-time code. Thus, register 265 could be configured
differently than that described above. For example, register
265 could include a beginning and an ending address instead
of a beginning address and a size value.

The contents of real-time address register 265 are provided
to comparator 285 which compares the contents of the
real-time address register 265 to the address provided by
processor interface 250. Comparator 285 provides an output
signal to replacement logic 287 over lines 288 to indicate
whether the address from processor interface 250 is an
address corresponding to real-time code, or not. Replacement
logic 287 provides control signals to memory interface
295 over lines 294 generally for directing the storage into
cache of real-time code in accordance with the present
invention.

The operation of the alternative embodiment shown in
FIG. 5 will now be described with reference to four
situations: (1) valid bit not set, cache miss, (2) valid bit set,
cache miss, and address within the real-time address range
specified by register 265, (3) valid bit set, cache miss, and
address not within real-time address range specified by
register 265, and (4) cache hit. In the second and third
situations, it will be seen that data way 1 is used to store
real-time code and data way 0 is used to store non-real-time
code. However, the selection of which data way to use for
storing real-time code is not important. Thus, data way 0
could be used to store real-time code.

In the first situation in which a cache miss is detected and
the V bits of registers 265 are cleared indicating that the
real-time code locking feature of the present invention has
been turned off, L2 cache subsystem 200 functions in
accordance with known cache protocol. An address provided
to processor interface 250 is compared against the tags
stored in tag logic 255. Upon detection of cache miss by
comparator 260 (i.e., the data corresponding to the address
provided by processor interface 250 is not currently stored
in either data way), a signal on line 263 directs the replacement
logic 287 to store data corresponding to that address in
one of the data ways. The V bits from register 265 are also
provided to replacement logic 287 on line 266 and thus
replacement logic 287 can determine that the locking feature
of the invention is disabled. In this situation (cache miss, V
bits clear), the replacement logic 287 stores the data
corresponding to the address from processor interface 250 in
either data way in accordance with known protocols. The
data to be stored in cache is retrieved from system memory
300 over lines 296 by memory interface 295 and stored in
the selected data way. Replacement logic 287 may use any
commonly known replacement algorithm such as the least
recently used algorithm in which the least recently used
datum in the data ways is replaced by the new data. Tag logic
255 is then updated by memory interface 295 to include the
tag associated with the new data stored in the data ways.

In the second situation, at least one of the valid bits in
registers 265 is set indicating that the real-time locking
feature of the present invention is enabled, and a cache miss
occurs. A cache miss for a real-time code address results in
storage of the associated real-time code in data way 1.
Comparator 260 compares the address to tags from tag logic
255 and indicates the existence of a cache miss on line 263
to replacement logic 287. Comparator 285 consequently
compares the address from processor interface 250 to the
range of real-time addresses specified by registers 265 and
determines that the address from processor interface 250
falls within the range of real-time addresses. Comparator
285 provides a signal on line 288 to replacement logic 287
indicating that the new address pertains to real-time code. In
response, replacement logic 287 directs the memory interface
295 to retrieve the real-time code associated with the
current address. After retrieval of the real-time code, the
real-time code is stored in data way 1 without replacing any
other real-time code already stored in data way 1. As
explained previously, data way 1 is dedicated to the storage
of real-time code when a V bit is set. Replacement logic 287
and memory interface 295 cooperate to prevent any real-time
code from being replaced when the real-time code
locking feature is enabled.

The third situation is similar to the second situation except
that the address received by processor interface 250 is not an
address for real-time code. Comparator 260 detects a cache
miss and comparator 285 determines that the new address
does not lie within the real-time address range specified by
registers 265, and that at least one V bit is set indicating that
the real-time code feature of the present invention is
enabled. Comparator 285 indicates to replacement logic 287
on lines 288 that the address is not a real-time code address.
In response, replacement logic 287 directs the memory
interface to retrieve the data corresponding to the address
from system memory 300 and store it in data way 0
preferably according to the least recently used algorithm
described previously.

In the fourth situation comparator 260 detects a cache hit
upon comparing the address from processor interface 250
and tags from tag logic 255. The output signal from
comparator 260 on line 262 is asserted indicating the presence of
a cache hit and also indicates in which of the data ways the
requested data is located. Multiplexer 290 uses this output
signal as a control signal and provides on its output lines 292
the data from the data way specified by the state of the
control signal. The requested data is provided to processor
101 through processor interface 250.

It should be recognized that the real-time code locking
feature of the present invention can be disabled by simply
clearing all of the V bits in registers 265. Once the V bits are
cleared, cache storage proceeds in accordance with known
protocols and new data can be stored in either data way
according to, for example, the least recently used method.
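
The four situations described above reduce to a small decision procedure, sketched below under stated assumptions. The names `RealTimeRegister`, `contains`, and `choose_way` are hypothetical, the Sz field is assumed to count bytes, and a register whose V bit is clear is assumed not to mark any address as real-time. The `lru_way` argument stands in for whatever way a conventional least recently used policy would pick. The sketch does not model what happens when data way 1 has no free entry for new real-time code.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

RT_WAY = 1      # data way dedicated to real-time code in the description
NORMAL_WAY = 0  # data way used for non-real-time code


@dataclass
class RealTimeRegister:
    rta: int     # starting address of the real-time code (RTA field)
    size: int    # size of the real-time code (Sz field); bytes assumed
    valid: bool  # V bit: locking enabled for this range when set

    def contains(self, addr: int) -> bool:
        """Model of comparator 285 for one register (assumed semantics)."""
        return self.valid and self.rta <= addr < self.rta + self.size


def choose_way(addr: int, hit: bool,
               registers: Sequence[RealTimeRegister],
               lru_way: int) -> Optional[int]:
    """Return the data way to fill, or None when no fill is needed."""
    if hit:
        return None          # situation 4: cache hit, nothing is replaced
    if not any(r.valid for r in registers):
        return lru_way       # situation 1: locking disabled, known protocol
    if any(r.contains(addr) for r in registers):
        return RT_WAY        # situation 2: real-time code goes to way 1
    return NORMAL_WAY        # situation 3: other data goes to way 0


# Usage: one enabled register covering 0x8000 through 0x83FF.
regs = [RealTimeRegister(rta=0x8000, size=0x400, valid=True)]
way_for_rt_code = choose_way(0x8010, False, regs, lru_way=0)
way_for_other = choose_way(0x9000, False, regs, lru_way=1)
```

Clearing every V bit makes `choose_way` fall back to `lru_way`, which mirrors the statement that the locking feature is disabled by clearing all of the V bits.
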

Numerous variations and modifications will become 
apparent to those skilled in the art once the above disclosure 
is fully appreciated. For example, cache subsystems internal 
to the processor (so called "L1 caches") may be used for 
storing, locking, and executing real-time code. It is intended 
that the following claims be interpreted to embrace all such 
variations and modifications. 

What is claimed is: 

1. A computer system for executing code from cache 
memory comprising: 

a processor for executing code; 
a system memory device for storing code and data; 
a system bus coupling said processor and said system 
memory; 

a cache memory subsystem coupled to said system bus, 
wherein said cache memory subsystem includes a plu- 
rality of data ways, wherein each data way includes a 
plurality of cache entries each configured for temporary 
storage of a line of code or data, wherein the cache 
entries of at least one of said plurality of data ways are 
lockable to lock real-time code therein, and wherein the 
cache entries of at least another of said plurality of data 
ways are not lockable. 

2. The computer system of claim 1, wherein said proces- 
sor includes a translation lookaside buffer (TLB) which 
includes a real-time code bit which indicates whether infor- 
mation stored in said system memory device corresponding 
to the real-time code bit comprises real-time code. 

3. The computer system of claim 2, wherein said proces- 
sor is configured to assert a real-time code output signal. 

4. The computer system of claim 3, wherein real-time 
code is stored and locked in said at least one of said plurality 
of data ways of said cache memory subsystem upon asser- 
tion of said real-time code output signal by said processor 
during a read operation by said processor. 

5. The computer system of claim 4, wherein said cache 
memory subsystem upon storing said real-time code in said 
at least one of said plurality of data ways of said cache 
memory subsystem, locks said lines of cache in which said 
real-time code is stored in response to receiving said real- 
time code output signal from said processor. 

6. A method of executing real-time code from cache 
memory, wherein said cache memory includes a plurality of 
data ways, wherein each data way includes a plurality of 
cache entries each configured for temporary storage of a line 
of code or data, wherein the cache entries of at least one of 
said plurality of data ways are lockable to lock real-time 
code therein, and wherein the cache entries of at least 
another of said plurality of data ways are not lockable, the 
method comprising the steps of: 

(a) updating entries in a TLB for translating virtual 
addresses associated with real-time code to physical 
addresses; 

(b) further updating a real-time code bit to ascertain if 
code to be executed comprises real-time code; 

(c) reading said real-time code bit to ascertain if code to 
be executed comprises real-time code; 

(d) a processor reading said real-time code if said real- 
time code bit indicates the presence of real-time code; 
(e) storing said real-time code in one of said at least one 
of said plurality of data ways of said cache memory; 

(f) locking said real-time code into said one of said at least 
one of said plurality of data ways of said cache memory 
to prevent overwrites; and 

(g) executing said real-time code from said cache 
memory. 

7. The method of claim 6, wherein the step of locking said 
real-time code into said cache memory includes setting a 
lock bit associated with each line of cache memory contain- 
ing real-time code. 

8. The method of claim 6, wherein said real-time code is 
unlocked from said cache memory after execution of the 
real-time code. 

9. The method of claim 6, wherein said executing said 
real-time code comprises said real-time code which operates 
on real-time data including multimedia data. 

10. A cache system for storing and locking real-time code, 
comprising: 

a first cache data way that is not lockable; 

a second cache data way in which real-time code is 
lockable when stored therein; 

a memory interface coupled to said first data way and said 
second data way; 

a real-time address register that includes addresses asso- 
ciated with real-time code stored in said second cache 
data way; and 

a comparator that compares an address received by said 
cache system with the contents of said real-time 
address register to determine if said address corre- 
sponds to real-time code. 

11. The cache system of claim 10 wherein said real-time 
address register includes a starting address of said real-time 
code. 

12. The cache system of claim 11 wherein said real-time 
address register further includes a size value indicating the 
size of the said real-time code. 

13. The cache system of claim 12 wherein said real-time 
address register further includes a valid bit indicating 
whether said real-time code is to be locked in said second 
cache data way of said cache system. 

14. The cache system of claim 13 further including: 

a memory interface coupled to said first cache data way 
and said second cache data way; and 

replacement logic coupled to said memory interface and 
said comparator. 

15. The cache system of claim 14 wherein said compara- 
tor provides a signal to said replacement logic, wherein said 
signal indicates whether said address received by said cache 
system is an address included in said real-time address 
register. 

16. The cache system of claim 15 wherein said valid bit 
from said real-time address register is provided to said 
replacement logic, wherein said replacement logic directs 
said memory interface to store real-time code exclusively 
into said second data way if said valid bit received from said 
real-time address register is set and said signal from said 
comparator indicates that an address received by said cache 
system is an address included in said real-time address 
register. 

17. The cache system of claim 10, wherein said real-time 
code operates on real-time data including multimedia data. 

* * * * * 